NSF PAR Search | NSF Public Access Repository

Federated target trial emulation using distributed observational data for treatment effect estimation

https://doi.org/10.1038/s41746-025-01803-y

Li, Haoyang; Zang, Chengxi; Xu, Zhenxing; Pan, Weishen; Rajendran, Suraj; Chen, Yong; Wang, Fei (December 2025, npj Digital Medicine)

Free, publicly-accessible full text available December 1, 2026
Multicenter target trial emulation to evaluate corticosteroids for sepsis stratified by predicted organ dysfunction trajectory

https://doi.org/10.1038/s41467-025-59643-z

Rajendran, Suraj; Xu, Zhenxing; Pan, Weishen; Zang, Chengxi; Siempos, Ilias; Torres, Lisa; Xu, Jie; Bian, Jiang; Schenck, Edward J; Wang, Fei (December 2025, Nature Communications)

Free, publicly-accessible full text available December 1, 2026
High-throughput target trial emulation for Alzheimer’s disease drug repurposing with real-world data

https://doi.org/10.1038/s41467-023-43929-1

Zang, Chengxi; Zhang, Hao; Xu, Jie; Zhang, Hansi; Fouladvand, Sajjad; Havaldar, Shreyas; Cheng, Feixiong; Chen, Kun; Chen, Yong; Glicksberg, Benjamin S; et al (December 2023, Nature Communications)

Target trial emulation is the process of mimicking target randomized trials using real-world data, where effective confounding control for unbiased treatment effect estimation remains a main challenge. Although various approaches have been proposed for this challenge, a systematic evaluation is still lacking. Here we emulated trials for thousands of medications from two large-scale real-world data warehouses, covering over 10 years of clinical records for over 170 million patients, aiming to identify new indications of approved drugs for Alzheimer’s disease. We assessed different propensity score models under the inverse probability of treatment weighting framework and suggested a model selection strategy for improved baseline covariate balancing. We also found that the deep learning-based propensity score model did not necessarily outperform logistic regression-based methods in covariate balancing. Finally, we highlighted five top-ranked drugs (pantoprazole, gabapentin, atorvastatin, fluticasone, and omeprazole) originally intended for other indications with potential benefits for Alzheimer’s patients.
Full Text Available
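
The abstract of the Alzheimer's disease drug repurposing entry above mentions propensity score models used under inverse probability of treatment weighting (IPTW) and judged by baseline covariate balance. The following is a minimal sketch of that general idea, not the authors' pipeline. It assumes propensity scores are already estimated and measures balance with the standardized mean difference (SMD), a common balance metric that the abstract does not name. The function names and the toy data are hypothetical.

```python
from math import sqrt


def iptw_weights(treated, propensity):
    """Return IPTW weights: 1/e for treated units, 1/(1 - e) for controls."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t else 1.0 / (1.0 - e))
    return weights


def _weighted_mean_var(values, weights):
    """Weighted mean and (population-style) weighted variance."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardized_mean_difference(x, treated, weights=None):
    """Absolute SMD of covariate x between treated and control groups."""
    if weights is None:
        weights = [1.0] * len(x)
    x1 = [v for v, t in zip(x, treated) if t]
    w1 = [w for w, t in zip(weights, treated) if t]
    x0 = [v for v, t in zip(x, treated) if not t]
    w0 = [w for w, t in zip(weights, treated) if not t]
    m1, v1 = _weighted_mean_var(x1, w1)
    m0, v0 = _weighted_mean_var(x0, w0)
    pooled_sd = sqrt((v1 + v0) / 2.0)
    return 0.0 if pooled_sd == 0 else abs(m1 - m0) / pooled_sd


# Toy data (hypothetical): a binary covariate x that drives treatment.
# Units with x = 1 are treated with probability 0.8, units with x = 0 with 0.2.
x = [1] * 10 + [0] * 10
treated = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
propensity = [0.8] * 10 + [0.2] * 10

weights = iptw_weights(treated, propensity)
smd_before = standardized_mean_difference(x, treated)
smd_after = standardized_mean_difference(x, treated, weights)
print(f"SMD before weighting: {smd_before:.3f}")
print(f"SMD after weighting:  {smd_after:.3f}")
```

In this toy example the covariate is strongly imbalanced before weighting and essentially balanced afterwards, because the propensity scores used are the true treatment probabilities. With an estimated propensity model, residual imbalance would be expected, which is the kind of difference a model selection strategy based on covariate balance would compare.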
Building the Model

https://doi.org/10.5858/arpa.2021-0635-RA

Yang, He S.; Rhoads, Daniel D.; Sepulveda, Jorge; Zang, Chengxi; Chadburn, Amy; Wang, Fei (October 2022, Archives of Pathology & Laboratory Medicine)

Context.— Machine learning (ML) allows for the analysis of massive quantities of high-dimensional clinical laboratory data, thereby revealing complex patterns and trends. Thus, ML can potentially improve the efficiency of clinical data interpretation and the practice of laboratory medicine. However, the risks of generating biased or unrepresentative models, which can lead to misleading clinical conclusions or overestimation of the model performance, should be recognized. Objectives.— To discuss the major components for creating ML models, including data collection, data preprocessing, model development, and model evaluation. We also highlight many of the challenges and pitfalls in developing ML models, which could result in misleading clinical impressions or inaccurate model performance, and provide suggestions and guidance on how to circumvent these challenges. Data Sources.— The references for this review were identified through searches of the PubMed database, US Food and Drug Administration white papers and guidelines, conference abstracts, and online preprints. Conclusions.— With the growing interest in developing and implementing ML models in clinical practice, laboratorians and clinicians need to be educated in order to collect sufficiently large and high-quality data, properly report the data set characteristics, and combine data from multiple institutions with proper normalization. They will also need to assess the reasons for missing values, determine the inclusion or exclusion of outliers, and evaluate the completeness of a data set. In addition, they require the necessary knowledge to select a suitable ML model for a specific clinical question and accurately evaluate the performance of the ML model, based on objective criteria. Domain-specific knowledge is critical in the entire workflow of developing ML models.
Full Text Available
SCEHR: Supervised Contrastive Learning for Clinical Risk Prediction using Electronic Health Records

https://doi.org/10.48550/arXiv.2110.04943

Zang, Chengxi; Wang, Fei (January 2021, arXiv.org)

Full Text Available
Neural Dynamics on Complex Networks

https://doi.org/10.1145/3394486.3403132

Zang, Chengxi; Wang, Fei (August 2020, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)
Full Text Available
MoFlow: An Invertible Flow Model for Generating Molecular Graphs

https://doi.org/10.1145/3394486.3403104

Zang, Chengxi; Wang, Fei (August 2020, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)
Full Text Available
Recent Advances on Graph Analytics and Its Applications in Healthcare

https://doi.org/10.1145/3394486.3406469

Wang, Fei; Cui, Peng; Pei, Jian; Song, Yangqiu; Zang, Chengxi (August 2020, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)
Full Text Available
Contrastive learning improves critical event prediction in COVID-19 patients

https://doi.org/10.1016/j.patter.2021.100389

Wanyan, Tingyi; Honarvar, Hossein; Jaladanki, Suraj K.; Zang, Chengxi; Naik, Nidhi; Somani, Sulaiman; De Freitas, Jessica K.; Paranjpe, Ishan; Vaid, Akhil; Zhang, Jing; et al (December 2021, Patterns)

Full Text Available
Dynamical Origins of Distribution Functions

https://doi.org/10.1145/3292500.3330842

Zang, Chengxi; Cui, Peng; Zhu, Wenwu; Wang, Fei (July 2019, Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)

Full Text Available